Identifiability of Phylogenetic Parameters from k-mer Data Under the Coalescent.
Internal identifier: 000539 (Main/Exploration); previous: 000538; next: 000540
Authors: Chris Durden [United States]; Seth Sullivant [United States]
Source:
- Bulletin of mathematical biology [ 1522-9602 ] ; 2019.
English descriptors
- MESH:
  - statistics & numerical data: Sequence Alignment.
  - Algorithms; Computational Biology; Evolution, Molecular; Markov Chains; Mathematical Concepts; Models, Genetic; Models, Statistical; Mutation; Phylogeny; Probability; Stochastic Processes.
Abstract
Distances between sequences based on their k-mer frequency counts can be used to reconstruct phylogenies without first computing a sequence alignment. Past work has shown that effective use of k-mer methods depends on (1) model-based corrections to distances based on k-mers and (2) breaking long sequences into blocks to obtain repeated trials from the sequence-generating process. Good performance of such methods is based on having many high-quality blocks with many homologous sites, which can be problematic to guarantee a priori. Nature provides a natural partition of sequences into homologous regions, namely the genes. However, directly using past work in this setting is problematic because of possible discordance between different gene trees and the underlying species tree. Using the multispecies coalescent model as a basis, we derive model-based moment formulas that involve the species divergence times and the coalescent parameters. From this setting, we prove identifiability results for the tree and branch length parameters under the Jukes-Cantor model of sequence mutations.
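As a rough illustration of the alignment-free setting the abstract describes, the sketch below counts overlapping k-mers in two sequences, takes the squared Euclidean distance between the count vectors, and averages that distance over paired blocks such as genes. It is a minimal sketch under stated assumptions, not the authors' method: the function names are hypothetical, the distance is the plain uncorrected one, and neither the model-based correction nor the coalescent moment formulas from the paper are implemented.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Return counts of all 4**k DNA k-mers in seq, in a fixed order.

    Hypothetical helper: windows containing characters outside ACGT
    (for example N) are simply not represented in the output vector.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # A fixed ordering of k-mers makes vectors from different sequences comparable.
    return [counts.get("".join(p), 0) for p in product("ACGT", repeat=k)]


def squared_kmer_distance(seq_a, seq_b, k):
    """Uncorrected squared Euclidean distance between k-mer count vectors."""
    a = kmer_counts(seq_a, k)
    b = kmer_counts(seq_b, k)
    return sum((x - y) ** 2 for x, y in zip(a, b))


def blockwise_mean_distance(blocks_a, blocks_b, k):
    """Average the distance over paired blocks (e.g. one block per gene).

    Assumes blocks_a[i] and blocks_b[i] are homologous regions from
    the two taxa being compared.
    """
    if len(blocks_a) != len(blocks_b) or not blocks_a:
        raise ValueError("need the same non-zero number of blocks per taxon")
    total = sum(squared_kmer_distance(a, b, k) for a, b in zip(blocks_a, blocks_b))
    return total / len(blocks_a)
```

Averaging over blocks is what treats each gene as a repeated trial; under the coalescent, the expectation of such an average would additionally depend on gene-tree variation, which this sketch does not model.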
DOI: 10.1007/s11538-018-0399-1
PubMed: 29392644
Links to previous steps (curation, corpus...)
- to stream PubMed, to step Corpus: 000A05
- to stream PubMed, to step Curation: 000A05
- to stream PubMed, to step Checkpoint: 000529
- to stream Ncbi, to step Merge: 001D26
- to stream Ncbi, to step Curation: 001D26
- to stream Ncbi, to step Checkpoint: 001D26
- to stream Main, to step Merge: 000542
- to stream Main, to step Curation: 000539
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Identifiability of Phylogenetic Parameters from k-mer Data Under the Coalescent.</title>
<author><name sortKey="Durden, Chris" sort="Durden, Chris" uniqKey="Durden C" first="Chris" last="Durden">Chris Durden</name>
<affiliation wicri:level="2"><nlm:affiliation>Department of Mathematics, North Carolina State University, Raleigh, NC, USA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Mathematics, North Carolina State University, Raleigh, NC</wicri:regionArea>
<placeName><region type="state">Caroline du Nord</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Sullivant, Seth" sort="Sullivant, Seth" uniqKey="Sullivant S" first="Seth" last="Sullivant">Seth Sullivant</name>
<affiliation wicri:level="2"><nlm:affiliation>Department of Mathematics, North Carolina State University, Raleigh, NC, USA. smsulli2@ncsu.edu.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Mathematics, North Carolina State University, Raleigh, NC</wicri:regionArea>
<placeName><region type="state">Caroline du Nord</region>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PubMed</idno>
<date when="2019">2019</date>
<idno type="RBID">pubmed:29392644</idno>
<idno type="pmid">29392644</idno>
<idno type="doi">10.1007/s11538-018-0399-1</idno>
<idno type="wicri:Area/PubMed/Corpus">000A05</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000A05</idno>
<idno type="wicri:Area/PubMed/Curation">000A05</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000A05</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000529</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Checkpoint">000529</idno>
<idno type="wicri:Area/Ncbi/Merge">001D26</idno>
<idno type="wicri:Area/Ncbi/Curation">001D26</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">001D26</idno>
<idno type="wicri:Area/Main/Merge">000542</idno>
<idno type="wicri:Area/Main/Curation">000539</idno>
<idno type="wicri:Area/Main/Exploration">000539</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Identifiability of Phylogenetic Parameters from k-mer Data Under the Coalescent.</title>
<author><name sortKey="Durden, Chris" sort="Durden, Chris" uniqKey="Durden C" first="Chris" last="Durden">Chris Durden</name>
<affiliation wicri:level="2"><nlm:affiliation>Department of Mathematics, North Carolina State University, Raleigh, NC, USA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Mathematics, North Carolina State University, Raleigh, NC</wicri:regionArea>
<placeName><region type="state">Caroline du Nord</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Sullivant, Seth" sort="Sullivant, Seth" uniqKey="Sullivant S" first="Seth" last="Sullivant">Seth Sullivant</name>
<affiliation wicri:level="2"><nlm:affiliation>Department of Mathematics, North Carolina State University, Raleigh, NC, USA. smsulli2@ncsu.edu.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Mathematics, North Carolina State University, Raleigh, NC</wicri:regionArea>
<placeName><region type="state">Caroline du Nord</region>
</placeName>
</affiliation>
</author>
</analytic>
<series><title level="j">Bulletin of mathematical biology</title>
<idno type="eISSN">1522-9602</idno>
<imprint><date when="2019" type="published">2019</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Algorithms</term>
<term>Computational Biology</term>
<term>Evolution, Molecular</term>
<term>Markov Chains</term>
<term>Mathematical Concepts</term>
<term>Models, Genetic</term>
<term>Models, Statistical</term>
<term>Mutation</term>
<term>Phylogeny</term>
<term>Probability</term>
<term>Sequence Alignment (statistics &amp; numerical data)</term>
<term>Stochastic Processes</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr"><term>Algorithmes</term>
<term>Alignement de séquences ()</term>
<term>Biologie informatique</term>
<term>Chaines de Markov</term>
<term>Concepts mathématiques</term>
<term>Modèles génétiques</term>
<term>Modèles statistiques</term>
<term>Mutation</term>
<term>Phylogénie</term>
<term>Probabilité</term>
<term>Processus stochastiques</term>
<term>Évolution moléculaire</term>
</keywords>
<keywords scheme="MESH" qualifier="statistics &amp; numerical data" xml:lang="en"><term>Sequence Alignment</term>
</keywords>
<keywords scheme="MESH" xml:lang="en"><term>Algorithms</term>
<term>Computational Biology</term>
<term>Evolution, Molecular</term>
<term>Markov Chains</term>
<term>Mathematical Concepts</term>
<term>Models, Genetic</term>
<term>Models, Statistical</term>
<term>Mutation</term>
<term>Phylogeny</term>
<term>Probability</term>
<term>Stochastic Processes</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr"><term>Algorithmes</term>
<term>Alignement de séquences</term>
<term>Biologie informatique</term>
<term>Chaines de Markov</term>
<term>Concepts mathématiques</term>
<term>Modèles génétiques</term>
<term>Modèles statistiques</term>
<term>Mutation</term>
<term>Phylogénie</term>
<term>Probabilité</term>
<term>Processus stochastiques</term>
<term>Évolution moléculaire</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Distances between sequences based on their k-mer frequency counts can be used to reconstruct phylogenies without first computing a sequence alignment. Past work has shown that effective use of k-mer methods depends on (1) model-based corrections to distances based on k-mers and (2) breaking long sequences into blocks to obtain repeated trials from the sequence-generating process. Good performance of such methods is based on having many high-quality blocks with many homologous sites, which can be problematic to guarantee a priori. Nature provides a natural partition of sequences into homologous regions, namely the genes. However, directly using past work in this setting is problematic because of possible discordance between different gene trees and the underlying species tree. Using the multispecies coalescent model as a basis, we derive model-based moment formulas that involve the species divergence times and the coalescent parameters. From this setting, we prove identifiability results for the tree and branch length parameters under the Jukes-Cantor model of sequence mutations.</div>
</front>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>Caroline du Nord</li>
</region>
</list>
<tree><country name="États-Unis"><region name="Caroline du Nord"><name sortKey="Durden, Chris" sort="Durden, Chris" uniqKey="Durden C" first="Chris" last="Durden">Chris Durden</name>
</region>
<name sortKey="Sullivant, Seth" sort="Sullivant, Seth" uniqKey="Sullivant S" first="Seth" last="Sullivant">Seth Sullivant</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000539 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000539 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Sante |area= MersV1 |flux= Main |étape= Exploration |type= RBID |clé= pubmed:29392644 |texte= Identifiability of Phylogenetic Parameters from k-mer Data Under the Coalescent. }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i -Sk "pubmed:29392644" \
  | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd \
  | NlmPubMed2Wicri -a MersV1
This area was generated with Dilib version V0.6.33.